Systems and methods for generating partial indexes in distributed databases

ABSTRACT

According to one aspect, methods and systems are provided for creating partial indexes in a distributed database environment. The database includes an index engine configured to receive at least one index field, a criteria field, and a criteria condition, wherein the criteria field is not included in the at least one index field; and generate an index comprising the at least one index field from at least one record of the plurality of records and a pointer to the at least one record of the plurality of records, wherein the criteria field of the at least one record of the plurality of records satisfies the criteria condition. The database further includes a query engine configured to receive a search query containing the at least one index field; and search the index for the at least one index field.

RELATED APPLICATIONS

This Application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No. 62/341,475, entitled “SYSTEMS AND METHODS FOR GENERATING PARTIAL INDEXES IN DISTRIBUTED DATABASES” filed on May 25, 2016, which is herein incorporated by reference in its entirety.

NOTICE OF MATERIAL SUBJECT TO COPYRIGHT PROTECTION

Portions of the material in this patent document are subject to copyright protection under the copyright laws of the United States and of other countries. The owner of the copyright rights has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the United States Patent and Trademark Office publicly available file or records, but otherwise reserves all copyright rights whatsoever. The copyright owner does not hereby waive any of its rights to have this patent document maintained in secrecy, including without limitation its rights pursuant to 37 C.F.R. § 1.14.

BACKGROUND

Technical Field

The present invention relates to distributed database systems and methods for generating database indexes to improve database performance.

Background Discussion

Database indexes in currently available database systems are data structures that store copies of certain columns (or fields) of data in the database. The database index can be used to speed database operations involving those index columns, because it avoids the time- and resource-intensive act of sequentially loading and examining every field in every record of data. Partial indexes are available in some relational database systems, wherein the index only includes entries for those rows that match an index column and defined criteria.

SUMMARY

According to various embodiments, indexes support the efficient execution of queries in any database (e.g., relational, no-SQL, dynamic schema, or an unstructured database (including the known database MONGODB)). For example, without indexes, a MongoDB database system would be required to perform a collection scan (i.e., a scan of every document in a collection (logical group of documents)) to select those documents that match a query statement made on a collection in the database. If an appropriate index exists for a query, a search process can find the index and use the index to limit the number of documents that the search process must inspect to respond to the query.

According to some embodiments, in non-relational environments indexes can be implemented as special data structures that store a small portion of the collection's data set in an easy to traverse form. A collection refers to a logical grouping of documents. Documents represent a base unit of data storage that contain fields and values. In non-relational or dynamic schema environments, rigid rules for the structure of the document are not enforced (unlike relational tables where data entries must conform to row and column definitions). These properties make the implementation of indexes and partial indexes more challenging than in relational architectures, where the data's format is known in advance.

In some embodiments, indexes are stored as documents that include a small portion of the collection's data set in an easy to traverse form. For example, the index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index entries also supports efficient equality matches and range-based query operations. In addition, the search process can return sorted results by using the ordering in the index.

One drawback of some conventional database indexes is that any criteria for inclusion in the index cannot be based on fields not included in the index. In non-relational architectures, including filters based on non-indexed fields is especially challenging. The creation of the index must account for the potential of any unstructured document in a collection to contain the index field, but also must account for matches on any of the documents in the collection based on the filter condition. Various embodiments perform these operations without the benefit of a strict schema that defines which tables include the indexed field in the first instance.

For example, a user of a database for processing ecommerce orders may wish to create an index on a single order_id field, thereby facilitating faster retrieval of records according to their order ID. Yet there may be thousands or millions of historical orders stored in the system, whereas the user is interested only in the relatively small number of unfulfilled orders in existence at a given time. Such conventional indexes would not allow the user to specify criteria that a document should be included in the index according to any field other than order_id, for example, by specifying that documents included in the index must have a status field equal to “open.” The alternative is to include fields in the index that, while not helpful to indexing and locating records, are used as criteria for inclusion in the index. The inclusion of such extraneous fields in the index may adversely affect the performance of searches relying on the index, thereby diluting the usefulness of the index.

There is therefore a need for a system and method for providing a database index that can be based on criteria fields not included as index fields for the database. The well-known MongoDB database system is an example of a “No SQL” database that can benefit from integration of partial indexes, in which one or more conditions are applied based on non-indexed fields so that the index represents a subset of documents stored in a database or database collection.

In other examples, a database system can include replica sets, in which multiple nodes contain replicated data. For instance, individual member nodes may respond to database operation requests (e.g., read and write operations) directed to the replica set, allowing for scalability, as well as redundancy in the event of a node failure.

Various aspects are directed to database systems that manage and/or provide for generation of partial indexes. In one example, partial indexes can be particularly challenging in a database having a non-relational or dynamic database structure.

Stated broadly, various aspects provide for the creation of a partial database index using a criteria field distinct from any index fields. According to an embodiment, management interfaces and processes are provided that enable such filtered partial indexes, thereby reducing a number of steps to be performed to create a filtered index, and accordingly reducing processing time and the opportunity for error. According to various implementations, the partial index can also improve query efficiency, for example, based on a reduced size of any corresponding partial index. In other embodiments, a larger number of partial indexes can be maintained in memory than larger conventional indexes, improving the database response capability.

According to one aspect a database system is provided. The database system comprises a database for storing a plurality of database records, an index engine configured to receive at least one index field, a criteria field, and a criteria condition, wherein the criteria field is not included in the at least one index field, and generate an index comprising the at least one index field from at least one record of the plurality of records and a pointer to the at least one record of the plurality of records, wherein the criteria field of the at least one record of the plurality of records satisfies the criteria condition, and a query engine configured to: receive a search query containing the at least one index field; and search the index for the at least one index field. According to one embodiment, the plurality of database records are a collection of documents in a non-relational database.

According to one embodiment, the index engine is further configured to identify, in the collection of documents, at least one document configured to store the criteria field. According to one embodiment, the index engine is further configured to resolve the criteria condition which comprises a comparison operator and a comparison value. According to one embodiment, the index engine is further configured to resolve the criteria condition based on the comparison operator selected from a group consisting of a greater-than operator, a less-than operator, an equals operator, and a does-not-equal operator. According to one embodiment, the index engine is further configured to resolve the criteria condition wherein the criteria condition comprises a logical operator for determining whether the criteria field is set in the at least one record of the plurality of records.

According to one aspect a method for creating a database index for a plurality of database records is provided. The method comprises acts of receiving at least one index field to be included in the database index, receiving a criteria field and a criteria condition, the criteria field not included in the at least one index field, and generating an index comprising the at least one index field from at least one record of the plurality of records and a pointer to the at least one record of the plurality of records, wherein the criteria field of the at least one record of the plurality of records satisfies the criteria condition.

According to one embodiment, the plurality of database records are a collection of documents in a non-relational database. According to one embodiment, the method further comprises identifying, in the collection of documents, at least one document configured to store the criteria field. According to one embodiment, the criteria condition comprises a comparison operator and a comparison value. According to one embodiment, the comparison operator is selected from the group consisting of a greater-than operator, a less-than operator, an equals operator, and a does-not-equal operator. According to one embodiment, the criteria condition comprises a logical operator for determining whether the criteria field is set in the at least one record of the plurality of records.

Still other aspects, embodiments, and advantages of these exemplary aspects and embodiments, are discussed in detail below. Any embodiment disclosed herein may be combined with any other embodiment in any manner consistent with at least one of the objects, aims, and needs disclosed herein, and references to “an embodiment,” “some embodiments,” “an alternate embodiment,” “various embodiments,” “one embodiment” or the like are not necessarily mutually exclusive and are intended to indicate that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment. The appearances of such terms herein are not necessarily all referring to the same embodiment. The accompanying drawings are included to provide illustration and a further understanding of the various aspects and embodiments, and are incorporated in and constitute a part of this specification. The drawings, together with the remainder of the specification, serve to explain principles and operations of the described and claimed aspects and embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects of at least one embodiment are discussed herein with reference to the accompanying figures, which are not intended to be drawn to scale. The figures are included to provide illustration and a further understanding of the various aspects and embodiments, and are incorporated in and constitute a part of this specification, but are not intended as a definition of the limits of the invention. Where technical features in the figures, detailed description or any claim are followed by reference signs, the reference signs have been included for the sole purpose of increasing the intelligibility of the figures, detailed description, and/or claims. Accordingly, neither the reference signs nor their absence are intended to have any limiting effect on the scope of any claim elements. In the figures, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every figure.

In the figures:

FIG. 1 illustrates a block diagram of an example architecture for a database according to aspects of the invention;

FIG. 2 illustrates a representation of the data stored in a database and an index according to aspects of the invention;

FIG. 3 illustrates an example process flow for generating a partial index for a database according to aspects of the embodiment;

FIG. 4 illustrates a block diagram of an example architecture for a database replica set, according to aspects of the invention;

FIG. 5 is a block diagram of an example distributed database system in which various aspects of the present invention can be practiced;

FIG. 6 is a block diagram of an example distributed database system in which various aspects of the present invention can be practiced;

FIG. 7 is a block diagram of an example distributed database system in which various aspects of the present invention can be practiced; and

FIG. 8 illustrates an example query on indexed data, according to one embodiment.

DETAILED DESCRIPTION

A system and method are provided for creating a partial index for a database in which criteria fields other than the index fields can be used to filter which record should be included in the index. In particular, a criteria field from at least one record and a criteria condition are provided as part of a request to create a partial index, so-called because it contains entries for only those records for which the criteria field satisfies the criteria condition. Records are searched, and the index fields from those records satisfying the criteria are included in the partial index. By allowing the criteria for inclusion to depend on fields other than the index fields, indexes can be optimized for the particular arrangement and usage of a database, without regard to what fields are included in the index.

Examples of the methods, devices, and systems discussed herein are not limited in application to the details of construction and the arrangement of components set forth in the following description or illustrated in the accompanying drawings. The methods and systems are capable of implementation in other embodiments and of being practiced or of being carried out in various ways. Examples of specific implementations are provided herein for illustrative purposes only and are not intended to be limiting. In particular, acts, components, elements and features discussed in connection with any one or more examples are not intended to be excluded from a similar role in any other examples.

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. Any references to examples, embodiments, components, elements or acts of the systems and methods herein referred to in the singular may also embrace embodiments including a plurality, and any references in plural to any embodiment, component, element or act herein may also embrace embodiments including only a singularity. References in the singular or plural form are not intended to limit the presently disclosed systems or methods, their components, acts, or elements. The use herein of “including,” “comprising,” “having,” “containing,” “involving,” and variations thereof is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. References to “or” may be construed as inclusive so that any terms described using “or” may indicate any of a single, more than one, and all of the described terms.

An example of a database system 100 is shown in FIG. 1. The database system 100 illustrates a system in which data may be stored and retrieved. For example, the database system 100 may be a standalone database, or may be a primary node or a secondary node within a replica set, wherein particular data is stored by more than one node to ensure high availability and stability in the event that one or more nodes become unavailable for some period of time. In other embodiments, the database system 100 may be a shard server storing a “shard,” or certain range of data, within a database system. Requests to a database system 100 implementing a shard server are directed by the system to the server storing the particular range of data where the data in question would be located.

The database system 100 may be arranged as a relational database, or as a non-relational database, such as the MongoDB database system offered by MongoDB, Inc. of New York, N.Y. and Palo Alto, Calif. The database system 100 includes a database 10 configured to store a primary copy of the database data. In a preferred embodiment, the database system 100 is a non-relational database system wherein the database 10 stores one or more collections of documents allowing for dynamic schemas. A collection is a group of documents that can be used for a loose, logical organization of documents. In such scenarios, a document is a collection of attribute-value associations relating to a particular entity, and in some examples forms a base unit of data storage for the managed database system. Attributes are similar to columns in a relational database, but do not require the same level of organization, and are therefore less subject to architectural constraints. In one example, the documents follow the known JSON format; in another, the documents may be stored as BSON documents. It should be appreciated, however, that the concepts discussed herein are applicable to relational databases and other database formats, and this disclosure should not be construed as being limited to non-relational or “no-SQL” databases in the disclosed embodiments.

In one example, the database data may include logical organizations of subsets of database data. The database data may include index data, which may include copies of certain fields of data that are logically ordered to be searched efficiently. Each entry in the index may consist of a key-value pair that represents a document or field (i.e., the key), and provides an address or pointer to a low-level disk block address where the document or field is stored (the value). The database data may also include an operation log (“oplog”), which is a chronological list of write/update operations performed on the data store during a particular time period. The oplog can be used to roll back or re-create those operations should it become necessary to do so due to a database crash or other error. Primary data, index data, or oplog data may be stored in any of a number of database formats, including row store, column store, log-structured merge (LSM) tree, or otherwise.

In other embodiments, the database system 100 is a relational database system, and the database 10 stores one or more tables comprising rows and columns of data according to a database schema.

The database 10 includes a plurality of records 20, 30 (e.g., documents in a non-relational database) each storing a number of attribute-value pairs. In one example, records 20, 30 may be employee records. Record 20 may be configured to store the value “Smith” for the attribute last_name, and record 30 may be configured to store the value “Jones” for the attribute last_name.

The database 10 further includes an index 40, which stores a copy of an index field 42, 46 for at least one of the plurality of records 20, 30. Index fields 22, 32 in records 20, 30 are used as the basis for creating the index 40. Thus, the index field 42, 46 corresponds to a value comprising or derived from the value of the attribute-value pair stored at index fields 22, 32 of the records 20, 30. The index 40 further comprises, for each index field 42, 46, a link 44, 48 to the record 20, 30 represented by the index field 42, 46. To continue the example above, the index field 22 may be the last_name attribute having a value of “Smith” for document 20. A copy of the index field 22 (“Smith”) is stored as the index field 42 in index 40. A link 44 contains a link, address, or other identifier allowing the system to locate record 20.

It will be appreciated that the index 40 may be a composite index storing more than one index field for each record, with each index field representing the value of an attribute-value pair stored in a record. Continuing the previous example, the index 40 containing the index field 22 of last_name may contain an additional index field of first_name. In such an example, the index 40 may be optimized based on a particular index field (typically the first column of the index).

The database system 100 further includes an index engine 50 configured to populate the index 40 as discussed herein. The index engine 50 generates the index with reference to a criteria field 24, 34 stored in records 20, 30. The index field 22, 32 of records 20, 30 may only be stored in the index 40 if the criteria fields 24, 34 of those records satisfy the criteria. For example, if the criteria field 24 of record 20 were employee_status, the index engine 50 may be configured to include the index field 22 of record 20 if the criteria field 24 (i.e., employee_status) has a value of “retired.” The index engine 50 may be configured to create and/or update the index 40 on the occurrence of certain events (e.g., a write operation being performed on an indexed document or the index field stored therein) or after a certain amount of time.
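The role of the index engine can be illustrated with a minimal in-memory sketch. The function name buildPartialIndex, the use of an array position as the link, and the employee_status values assigned to the two records are assumptions made for illustration only; they are not taken from any particular database implementation.

```javascript
// Hypothetical in-memory sketch; not the actual index engine of any database.
function buildPartialIndex(records, indexField, criteriaField, predicate) {
  const entries = [];
  records.forEach((record, position) => {
    // Only records whose criteria field satisfies the condition are indexed.
    if (criteriaField in record && predicate(record[criteriaField])) {
      // Each entry holds a copy of the index field plus a link to the record.
      // Here the link is simply the record's array position (an assumption).
      entries.push({ key: record[indexField], link: position });
    }
  });
  // Keep the entries ordered by key so that lookups can use binary search.
  entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return entries;
}

// Illustrative records; the status values are assumed.
const records = [
  { last_name: "Smith", employee_status: "retired" },
  { last_name: "Jones", employee_status: "active" },
];

const index = buildPartialIndex(
  records,
  "last_name",
  "employee_status",
  (status) => status === "retired"
);
console.log(index);
```

Note that the criteria field is consulted only while building the index and is not itself stored in any index entry.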

The database system 100 further includes a query engine 60 configured to interact with the data in the database 10 by performing read and write operations. The query engine 60 selectively refers to the index 40 in order to perform database operations as efficiently as possible. For example, where the index field 42 in index 40 represents the last_name attribute, a query on the field last_name may use the index 40 to quickly locate any records satisfying the query criteria. The query engine 60 may follow the link stored in any such index entries to access the responsive records.
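As a rough illustration of how a query engine might use an ordered index, the sketch below performs a binary search over sorted entries and then follows each link. The findByKey function, the entry layout, and the first_name values are hypothetical and serve only to make the example concrete.

```javascript
// Hypothetical sketch: entries are sorted by key; each link is an array position.
function findByKey(indexEntries, records, key) {
  let lo = 0;
  let hi = indexEntries.length;
  // Binary search for the first entry whose key is >= the search key.
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (indexEntries[mid].key < key) lo = mid + 1;
    else hi = mid;
  }
  const results = [];
  // Collect every consecutive matching entry and follow its link to the record.
  for (let i = lo; i < indexEntries.length && indexEntries[i].key === key; i++) {
    results.push(records[indexEntries[i].link]);
  }
  return results;
}

const employees = [
  { last_name: "Smith", first_name: "Ann" },
  { last_name: "Jones", first_name: "Bob" },
  { last_name: "Smith", first_name: "Cal" },
];
const lastNameIndex = [
  { key: "Jones", link: 1 },
  { key: "Smith", link: 0 },
  { key: "Smith", link: 2 },
];

const smiths = findByKey(lastNameIndex, employees, "Smith");
console.log(smiths.map((e) => e.first_name));
```

Only the index entries are compared during the search; the records themselves are touched only when a matching link is followed.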

According to another example, if a document which matches the partialFilterExpression contains none of the fields specified in an index key pattern, then the system inserts an index key into the index with null values for the missing fields:

    > db.c.drop()
    true
    > db.c.createIndex({a: 1}, {partialFilterExpression: {b: {$gt: 3}}})
    {
      "createdCollectionAutomatically" : true,
      "numIndexesBefore" : 1,
      "numIndexesAfter" : 2,
      "ok" : 1
    }
    > db.c.insert({b: 99})
    WriteResult({ "nInserted" : 1 })
    > db.c.find().hint({a: 1}).returnKey()
    { "a" : null }

It will be appreciated that the index engine 50 and the query engine 60 are shown as standalone components for illustration purposes only. In some embodiments, the index engine 50 and the query engine 60 are incorporated in a database application 20 that handles data requests, manages data access, and performs background management operations for the database system 100. The database application 20 is configured to interact with various components of the database system 100, including at least one storage engine (not shown) for writing data to the database 10.

An exemplary collection of records 210-240 and corresponding index 250 is shown in FIG. 2. In this example, each of the records 210-240 contains information about an individual employee, including full name, address, and employment status. In this example, the fields last_name and address_state are used as the index fields 212, 214, and the employee_status field is used as the criteria field 216. The criteria field 216 may be used to filter which records are to be included in the index 250, having records 252 “Delaney,” 254 “Fetty,” and 256 “Tidwell.” In this example, the index 250 has been generated using the criteria that a record should be included in the index 250 if the employee_status field (i.e., criteria field 216) for that record has a value of “active.” In this example, the records 220, 230, 240 are included in the index, since their criteria fields 216 satisfy the criteria for inclusion in the index; that is, the employees identified in each of those records have an employee_status of “active.” By contrast, record 210 is not included because the associated employee has an employee_status of “retired.”

Criteria for inclusion in an index may be provided through a database command, for example, at the time the index is created. As an example, the following code may create the index 250 discussed in the example above:

    db.inventory.createIndex({last_name: 1, address_state: 1}, {partialFilterExpression: {employee_status: "active"}})

Criteria may be provided in any number of formats, including logical operators, comparison operators, and the like. For example, the following pseudocode may create an index of the last names of employees whose salary is greater than $50,000 a year:

    db.inventory.createIndex({last_name: 1}, {partialFilterExpression: {salary: {$gt: 50000}}})

Similarly, the following code may create an index of the last names of employees whose home address is not located in Connecticut:

    db.inventory.createIndex({last_name: 1}, {partialFilterExpression: {address_state: {$ne: "CT"}}})

In addition to the textual criteria examples given here, non-textual criteria may also be provided. For example, geographic criteria may be provided for records that include geolocation information, such as latitude and longitude coordinates. The criteria may be expressed in terms of distance from a particular location as expressed by the coordinates, or may be expressed in terms of a location's inclusion or exclusion from a geometric shape centered on a geographic location (e.g., a circle having a 30 mile radius around a particular city, or a polygon that circumscribes a particular voting district).
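A distance-based geographic criterion of this kind could be evaluated along the following lines. This is only a sketch: it uses the haversine great-circle formula, which is one common choice rather than a method the description specifies, and the location field name, the withinRadius helper, and the coordinates are all hypothetical.

```javascript
// Mean Earth radius in miles (approximate).
const EARTH_RADIUS_MILES = 3958.8;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two {lat, lon} points using the haversine formula.
function distanceMiles(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLon = toRadians(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
}

// Hypothetical criteria check: the record qualifies for the index only if it
// carries a location that lies within the given radius of the center point.
function withinRadius(record, center, radiusMiles) {
  return "location" in record && distanceMiles(record.location, center) <= radiusMiles;
}

const center = { lat: 40.0, lon: -75.0 }; // illustrative coordinates
const nearby = { name: "A", location: { lat: 40.2, lon: -75.0 } };
const distant = { name: "B", location: { lat: 41.0, lon: -75.0 } };
const noLocation = { name: "C" };

console.log(withinRadius(nearby, center, 30));
console.log(withinRadius(distant, center, 30));
console.log(withinRadius(noLocation, center, 30));
```

A polygon-based criterion would replace the distance test with a point-in-polygon test while leaving the rest of the index construction unchanged.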

A process 300 of generating a partial index for a database using a criteria field is shown in FIG. 3.

At step 310, process 300 begins.

At step 320, at least one index field to be included in the database index is received, and at step 330, a criteria field and a criteria condition are received, with the criteria field not included in the at least one index field (i.e., the criteria field is not one of the index fields). The criteria may be received in textual form as part of a function call or command to create the index. For example, a “createIndex” function may be called, with the index field(s), criteria field, and criteria condition passed to the function as parameters. In one example, the criteria field and criteria condition are provided in the form of a logical expression. Such a logical expression may indicate, for example, that a record should be included in the index if “salary>=50000,” “status NOT NULL,” or the like. In other embodiments, a graphical interface may be provided through which a user may define the index by graphically selecting index fields, criteria fields, criteria conditions, and the like.
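One way such a criteria condition might be resolved against a record is sketched below. The satisfies function and the OPERATORS table are hypothetical. The operator names mirror the shell-style examples in this description, and the treatment of a missing field (it fails every comparison) is a simplifying assumption that may differ from the semantics of a real database.

```javascript
// Hypothetical evaluator for a small subset of comparison operators.
const OPERATORS = {
  $gt: (actual, expected) => actual > expected,
  $lt: (actual, expected) => actual < expected,
  $eq: (actual, expected) => actual === expected,
  $ne: (actual, expected) => actual !== expected,
};

function satisfies(record, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const isOperatorObject = typeof condition === "object" && condition !== null;
    // A bare value such as {employee_status: "active"} is treated as equality.
    const tests = isOperatorObject ? condition : { $eq: condition };
    return Object.entries(tests).every(([op, expected]) => {
      // $exists checks only whether the criteria field is set in the record.
      if (op === "$exists") return (field in record) === expected;
      if (!(op in OPERATORS)) throw new Error("unsupported operator: " + op);
      // Assumption: a record lacking the field fails every comparison.
      return field in record && OPERATORS[op](record[field], expected);
    });
  });
}

console.log(satisfies({ salary: 60000 }, { salary: { $gt: 50000 } }));
console.log(satisfies({ address_state: "CT" }, { address_state: { $ne: "CT" } }));
console.log(satisfies({ last_name: "Smith" }, { status: { $exists: true } }));
```

Keeping the operators in a lookup table means additional conditions can be supported by adding entries rather than by changing the evaluation loop.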

At optional step 340, in a preferred embodiment in which a non-relational database is employed, at least one document configured to store the index field and/or criteria field is identified. In non-relational database systems, there is typically not strict enforcement of a schema indicating what fields must be stored by records (in this case, documents). In such embodiments, it may therefore be necessary to determine if a document potentially to be included in the index includes the index field and/or criteria field. In some embodiments, the absence of one or both of the fields from the document causes that document to be excluded from further consideration for the index, since there is no way to determine if the document should be included (due to the lack of a criteria field) or to include it in the index (due to the lack of an index field).
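The optional identification step can be sketched as a simple pre-filter over a schemaless collection. The eligibleDocuments function and the sample documents are hypothetical, and the sketch reflects only those embodiments in which a document missing either field is excluded from consideration.

```javascript
// Hypothetical pre-filter: keep only documents that carry the criteria field
// and every index field, so that later steps can evaluate and index them.
function eligibleDocuments(collection, indexFields, criteriaField) {
  const has = (doc, field) => Object.prototype.hasOwnProperty.call(doc, field);
  return collection.filter(
    (doc) => has(doc, criteriaField) && indexFields.every((field) => has(doc, field))
  );
}

// Documents in a schemaless collection need not share the same fields.
const orders = [
  { order_id: 1, status: "open" },
  { order_id: 2 },                 // no criteria field
  { status: "open" },              // no index field
];

const candidates = eligibleDocuments(orders, ["order_id"], "status");
console.log(candidates);
```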

At step 350, an index is generated. Each entry in the index includes a field based on the at least one index field from at least one record of the plurality of records that satisfies the criteria condition, as well as a pointer to the at least one record of the plurality of records. In one embodiment, the index may be stored as a B-tree, B+ tree, or variation thereof, with the index being searchable by traversing the leaves of the tree structure. B-trees and their variants are described in “The Ubiquitous B-Tree” by Douglas Comer (Computing Surveys, Vol. 11, No. 2, June 1979), which is hereby incorporated by reference in its entirety. In other embodiments, the index is stored in an array or other structure known in the art. The record pointer stored in each entry in the index may be an address or pointer to a low-level disk block address where the record is stored. The index may be ordered in a way that facilitates the rapid location and retrieval of the indexed records. For example, the entries of the index may be ordered alphabetically according to the index field (or the first index field if multiple index fields are included).

The index field stored in an entry in the index may be an exact copy of the corresponding index field from the record, or may be a function of the index field from the record. In some embodiments, such a function may be applied to the index field to ensure data consistency or improved performance of the index and/or database. For example, the index field from the record may be converted to all uppercase letters in the index. During a database operation, the index field in a record may also be converted to all uppercase letters before being compared to the index entry, thereby simplifying the logic and processing necessary for performing the comparison.
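A minimal sketch of such a key function follows, assuming the same normalization is applied both when building the index and when looking up a value. The names normalizeKey, buildNormalizedIndex, and lookup are hypothetical, and a Map from key to record positions stands in for whatever ordered structure an implementation would actually use.

```javascript
// Hypothetical normalization applied to index keys and to query values alike.
const normalizeKey = (value) => String(value).toUpperCase();

function buildNormalizedIndex(records, indexField) {
  const index = new Map();
  records.forEach((record, position) => {
    const key = normalizeKey(record[indexField]);
    if (!index.has(key)) index.set(key, []);
    // Store the record's position as its link (an assumption of this sketch).
    index.get(key).push(position);
  });
  return index;
}

function lookup(index, queryValue) {
  // Normalizing the query value reduces the comparison to an exact match.
  return index.get(normalizeKey(queryValue)) || [];
}

const people = [{ last_name: "Smith" }, { last_name: "SMITH" }, { last_name: "Jones" }];
const upperIndex = buildNormalizedIndex(people, "last_name");
const hits = lookup(upperIndex, "smith");
console.log(hits);
```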

In some embodiments, it may be desirable to rearrange the index field to more evenly distribute entries in the index. For example, a B-tree index built on an incrementally-created ID_number would be unbalanced, as each new ID_number would be inserted to the right of the previous entry. Such a scheme may also cause contention for memory space, as the entries created during a particular time would be stored in adjacent memory locations. Subsequent entries must be queued to access this area of memory, potentially slowing down performance of the system. In one example, the index field may be rearranged by reversing the order of the text within. In such an example, the number “123456” would be stored as “654321,” whereas the next number “123457” would be stored as “754321” in a different portion of the index than “654321,” given the different first digits. Other schemes for minimizing data clusters may also be implemented.
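The reversal described above is short enough to show directly. The helper name reverseKey is hypothetical; the first two inputs are the numbers used in the text and the third is added for illustration.

```javascript
// Reverse the characters of an identifier so that consecutive identifiers
// differ in their leading character and land in different parts of the index.
const reverseKey = (id) => String(id).split("").reverse().join("");

const sequentialIds = [123456, 123457, 123458];
const storedKeys = sequentialIds.map(reverseKey);
console.log(storedKeys); // [ '654321', '754321', '854321' ]
```

Because the transformation is its own inverse, applying reverseKey to a stored key recovers the original identifier.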

At step 360, process 300 ends.

The embodiments shown and discussed with respect to FIGS. 1 and 2 depict a single database system. Yet in some embodiments, multiple storage nodes may be provided and arranged in a replica set, such as the embodiments described in U.S. patent application Ser. No. 12/977,563, which is hereby incorporated by reference in its entirety. FIG. 4 shows a block diagram of an exemplary replica set 410. Replica set 410 includes a primary node 420 and one or more secondary nodes 430, 440, 450, each of which is configured to store a dataset that has been inserted into the database. As discussed in more detail below, separate indexes may be created for the same database on different nodes, with the indexes structured or optimized according to the node's role in the replica set as a primary node or a secondary node. For example, regionally distributed nodes can each have their own partial index that is built on the same primary field but implements different filter conditions related to the position of the respective node.

The primary node 420 may be configured to store all of the documents currently in the database, and may be considered and treated as the authoritative version of the database in the event that any conflicts or discrepancies arise, as will be discussed in more detail below. While three secondary nodes 430, 440, 450 are depicted for illustrative purposes, any number of secondary nodes may be employed, depending on cost, complexity, and data availability requirements. In a preferred embodiment, one replica set may be implemented on a single server, or a single cluster of servers. In other embodiments, the nodes of the replica set may be spread among two or more servers or server clusters.

The primary node 420 and secondary nodes 430, 440, 450 may be configured to store data in any number of database formats or data structures as are known in the art. In a preferred embodiment, the primary node 420 is configured to store documents or other structures associated with non-relational databases. The embodiments discussed herein relate to documents of a document-based database, such as those offered by MongoDB, Inc. (of New York, N.Y. and Palo Alto, Calif.), but other data structures and arrangements are within the scope of the disclosure as well.

In one embodiment, both read and write operations may be permitted at any node (including primary node 420 or secondary nodes 430, 440, 450) in response to requests from clients. The scalability of read operations can be achieved by adding nodes and database instances. In some embodiments, the primary node 420 and/or the secondary nodes 430, 440, 450 are configured to respond to read operation requests by either performing the read operation at that node or by delegating the read request operation to another node (e.g., a particular secondary node 430). Such delegation may be performed based on load-balancing and traffic direction techniques known in the art.

In some embodiments, the database only allows write operations to be performed at the primary node 420, with the secondary nodes 430, 440, 450 disallowing write operations. In such embodiments, the primary node 420 receives and processes write requests against the database, and replicates the operation/transaction asynchronously throughout the system to the secondary nodes 430, 440, 450. In one example, the primary node 420 receives and performs client write operations and generates an oplog. Each logged operation is replicated to, and carried out by, each of the secondary nodes 430, 440, 450, thereby bringing those secondary nodes into synchronization with the primary node 420.
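The oplog-based replication described above may be modeled, purely for illustration, with the following in-memory sketch. The node structure and the functions createNode, applyOperation, writeToPrimary, and catchUp are hypothetical and do not correspond to any actual replication protocol or driver API; network transport and failure handling are omitted.

```javascript
// Hypothetical in-memory model of a replica set node.
function createNode() {
  return { documents: new Map(), applied: 0 };
}

// Apply one logged operation to a node's local copy of the data.
function applyOperation(node, op) {
  if (op.type === "delete") {
    node.documents.delete(op.id);
  } else {
    // "insert" and "update" both store the full document in this sketch.
    node.documents.set(op.id, op.doc);
  }
}

// The primary performs the client write and appends it to the oplog.
function writeToPrimary(primary, oplog, op) {
  applyOperation(primary, op);
  oplog.push(op);
}

// A secondary replays, in order, every oplog entry it has not yet applied.
function catchUp(secondary, oplog) {
  while (secondary.applied < oplog.length) {
    applyOperation(secondary, oplog[secondary.applied]);
    secondary.applied += 1;
  }
}

const oplog = [];
const primary = createNode();
const secondary = createNode();

writeToPrimary(primary, oplog, { type: "insert", id: 1, doc: { name: "a" } });
writeToPrimary(primary, oplog, { type: "insert", id: 2, doc: { name: "b" } });
writeToPrimary(primary, oplog, { type: "delete", id: 1 });

// Until it replays the oplog, the secondary lags behind the primary.
const sizeBeforeCatchUp = secondary.documents.size;
catchUp(secondary, oplog);
```

The secondary is stale until it replays the log and matches the primary afterwards, which is the asynchronous behavior the passage describes.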

In some embodiments, the primary node 420 and the secondary nodes 430, 440, 450 may operate together to form a replica set 410 that achieves eventual consistency, meaning that replication of database changes to the secondary nodes 430, 440, 450 may occur asynchronously. When write operations cease, all replica nodes of a database will eventually “converge,” or become consistent. This may be a desirable feature where higher performance is important, such that locking records while an update is stored and propagated is not an option. In such embodiments, the secondary nodes 430, 440, 450 may handle the bulk of the read operations made on the replica set 410, whereas the primary node 420 handles the write operations. For read operations where a high level of accuracy is important (such as the operations involved in creating a secondary node), read operations may be performed against the primary node 420.

Given these different roles of the primary node 420 and the secondary nodes 430, 440, 450, it may be desirable to create, for each database, an index on each node suitable for that node's role in the replica set. For example, because a primary node 420 may primarily perform write operations, an index may be created on a unique system-generated identifier that can be used to quickly locate the most recently-added record in order to find a memory location for a new record. By contrast, the performance of a secondary node 430 that performs repeated queries on a particular field (e.g., a last name of an employee) may be improved by creating an index based on that last name. In some embodiments, indexes may therefore be created for each node, with each index possibly including different index fields. In other embodiments, multiple index fields may be stored in each index, but the ordering of the columns and/or the data within the columns may be optimized by node. The index in the previous example may include the unique system-generated identifier and the employee's last name on each node, but the order of those columns, and the order in which the entries are sorted, may vary by node.

In such an environment, indexes may be created for a particular database on each node of the replica set, with the indexes on different nodes being optimized according to the data characteristics and role of the node. In one embodiment, secondary nodes are primarily responsible for responding to read requests. The secondary nodes may also be tasked with frequent aggregation operations, which process data records and return computed results: they group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. An index on a specific field, filtered on fields outside the index and tailored to select documents of interest (e.g., documents that are not written to with any frequency, that specify a status of interest, or that are the basis of aggregation operations), thereby facilitates faster retrieval and processing of aggregation operations. Additionally, the partial index created is more compact than a conventional index and may be accessed and maintained in memory without the same burden on the system as a conventional index.

In another embodiment, primary nodes in a replica set are typically responsible for write operations, thus a partial index that includes a broad index term filtered on record fields that are frequently written provides an index optimized to the role of the primary node without the drawback of indexing the field that is frequently written, which would require frequent regeneration of the index.

It will be appreciated that the difference between the primary node 420 and the one or more secondary nodes 430, 440, 450 in a given replica set may be largely the designation itself and the resulting behavior of the node; the data, functionality, and configuration associated with the nodes may be largely identical, or capable of being identical. Thus, when one or more nodes within a replica set 410 fail or otherwise become unavailable for read or write operations, other nodes may change roles to address the failure. For example, if the primary node 420 were to fail, a secondary node 430 may assume the responsibilities of the primary node, allowing operation of the replica set to continue through the outage. This failover functionality is described in U.S. application Ser. No. 12/977,563, the disclosure of which has been incorporated by reference.

Each node in the replica set 410 may be implemented on one or more server systems. Additionally, one server system can host more than one node. Each server can be connected via a communication device to a network, for example the Internet, and each server can be configured to provide a heartbeat signal notifying the system that the server is up and reachable on the network. Sets of nodes and/or servers can be configured across wide area networks, local area networks, intranets, and can span various combinations of wide area, local area and/or private networks. Various communication architectures are contemplated for the sets of servers that host database instances and can include distributed computing architectures, peer networks, virtual systems, among other options.

The primary node 420 may be connected by a LAN, a WAN, or other connection to one or more of the secondary nodes 430, 440, 450, which in turn may be connected to one or more other secondary nodes in the replica set 410. Connections between secondary nodes 430, 440, 450 may allow the different secondary nodes to communicate with each other, for example, in the event that the primary node 420 fails or becomes unavailable and a secondary node must assume the role of the primary node.

Example Indexes

Illustrated in FIG. 8 is a query that selects and orders the matching documents using an index. According to various embodiments, indexes in MongoDB are similar to indexes in other database systems. In one example, indexes are defined on the database system at the collection level and the database system is configured to support indexes on any field or sub-field of the documents in a collection.

Collections within the database are groupings of documents. A collection is similar to a relational database table; however, collections do not enforce a schema. Documents within a collection can have different fields. Typically, documents in a collection have a similar or related purpose. Documents define the basic unit of data storage in the database system. According to various embodiments, documents are analogous to JSON objects but are configured in the database in a more type-rich format known as BSON. According to various embodiments, indexes can be implemented with a B-tree data structure, although other embodiments can implement other data structures to store and/or access index data (e.g., row store, column store, log-structured merge (LSM) tree, etc.).

In some embodiments, the system can be configured to create and access a number of different index types to support specific types of data and queries. For example, single field indexes can be defined to support user-defined ascending/descending indexes on a single field of a document. In other examples, compound indexes can be created, as well as multi-key indexes (e.g., to index the content stored in arrays). Various embodiments can implement geo-spatial indexes (e.g., planar geometry and spherical geometry based indexes), text indexes (e.g., to search for string content in a collection), and/or hashed indexes (e.g., to support hash based sharding). Each of these indexes can be augmented by filter criteria to generate partial index versions of each index type.

As discussed above, partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance. Reducing the storage required improves execution efficiency for the partial index and therefore the database as a whole. In further embodiments, smaller footprints (i.e., storage size) translate into the ability to better maintain partial indexes in memory for faster execution.

According to one embodiment, users can create a partial index via a command line interface, a graphical tool, or another management tool by entering a create index command accompanied by a filter expression (e.g., via a db.collection.createIndex() method with a partialFilterExpression option). According to one example, a partialFilterExpression option is configured to accept as an input a document that specifies the filter condition using:

-   -   equality expressions (i.e., field: value or using the $eq operator),
    -   $exists: true expressions,
    -   $gt (greater than), $gte (greater than or equal), $lt (less than), $lte (less than or equal) expressions,
    -   $type expressions (e.g., specifying a BSON type),
    -   $and operator (e.g., a logical AND operation on an array of two or more expressions (e.g., <expression1>, <expression2>, etc.) that selects the documents that satisfy all the expressions in the array) (some embodiments may limit $and to the top level only).

For example, the following operation creates a compound index that indexes only the documents with a rating field greater than 5.

db.restaurants.createIndex(
  { cuisine: 1, name: 1 },
  { partialFilterExpression: { rating: { $gt: 5 } } }
)
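To illustrate what such a command asks of the index engine, the following plain JavaScript sketch filters a set of documents and then generates sorted index entries for the documents that pass the filter. It is an in-memory approximation under stated assumptions, not the database's implementation: matchesFilter, compareKeys, and buildPartialIndex are hypothetical names, the sample documents are invented for the example, and only the comparison operators shown are supported.

```javascript
// Hypothetical subset of filter operators supported by this sketch.
const comparators = {
  $eq: (a, b) => a === b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
};

// Filter stage: a document qualifies only if every filter condition holds.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    if (value === undefined || value === null) {
      return false; // a missing or null criteria field never satisfies a comparison here
    }
    return Object.entries(condition).every(([op, operand]) =>
      comparators[op](value, operand)
    );
  });
}

// Order compound keys field by field, ascending.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i += 1) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Index generation stage: one sorted entry (key plus document reference) per qualifying document.
function buildPartialIndex(docs, keyFields, filter) {
  return docs
    .filter((doc) => matchesFilter(doc, filter))
    .map((doc) => ({ key: keyFields.map((f) => doc[f]), id: doc._id }))
    .sort((a, b) => compareKeys(a.key, b.key));
}

// Invented sample data for illustration only.
const restaurants = [
  { _id: 1, cuisine: "Italian", name: "Roma", rating: 9 },
  { _id: 2, cuisine: "Bakery", name: "Crumbs", rating: 4 },
  { _id: 3, cuisine: "Italian", name: "Alba", rating: 6 },
  { _id: 4, cuisine: "Thai", name: "Lotus" }, // no rating field
];

const partialIndex = buildPartialIndex(
  restaurants,
  ["cuisine", "name"],
  { rating: { $gt: 5 } }
);
```

In this sketch, only the two documents with a rating above 5 receive index entries, which is why the partial index is smaller than an index over the whole collection.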

According to various embodiments, the system is configured to accept specification of a filter expression (e.g., via a partialFilterExpression command option) for all defined index types. In some environments, the system is configured to determine if a query matches an existing partial index as a condition predicate. In other words, the system is configured to use partial indexes when the query or sort option matches the filter condition or is a subset of the filter condition specified in the partial index. According to one embodiment, the system is configured not to use a partial index for a query or sort operation if using the index results in an incomplete result set.

According to various embodiments, to use a partial index in resolving a query, the system first determines that a query contains the filter expression (or a modified filter expression that specifies a subset of the filter expression) as part of its query condition.

For example, given the following index:

db.restaurants.createIndex(
  { cuisine: 1 },
  { partialFilterExpression: { rating: { $gt: 5 } } }
)

The following query can use the index since the query predicate includes the condition rating: {$gte: 8} that matches a subset of documents matched by the index filter expression rating: {$gt: 5}:

-   -   db.restaurants.find({cuisine: "Italian", rating: {$gte: 8} })

However, to provide a contrary example, the following query cannot use the partial index on the cuisine field because using the index results in an incomplete result set. Specifically, the query predicate includes the condition rating: {$lt: 8} while the index has the filter rating: {$gt: 5}. That is, the query {cuisine: "Italian", rating: {$lt: 8} } matches more documents (e.g. an Italian restaurant with a rating equal to 1) than are indexed.

-   -   db.restaurants.find({cuisine: "Italian", rating: {$lt: 8} })

The system is configured to detect this inconsistency and ignore any partial index that fails to cover the conditions of the query. In one embodiment, the system is configured to project the data needed to respond to the query and evaluate the data needed against the index to determine if the index may be used. For example, the index may be stored as a B-tree and the system can project the data needed to respond to the query and scan or search the tree to determine coverage. Thus, various embodiments, as part of processing partial indexes, are configured to determine that the search query is guaranteed to match a set of documents that is contained within the subset of documents before using the partial index in generating a set of results.

Similarly, in another embodiment, the system can determine that the following query cannot use the partial index because the query predicate does not include the filter expression and using the index would return an incomplete result set.

-   -   db.restaurants.find({cuisine: "Italian" })

According to some embodiments, the system is configured to determine that the query predicate matches the filter expression of the partial index or resolves into a subset of the filter expression of the partial index before using the partial index to respond to a query.
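One possible way to express the subset determination for numeric range conditions is to convert each condition into an interval and test containment. The sketch below is an assumption-laden illustration rather than the query engine's actual logic: toInterval, isSubset, and canUsePartialIndex are hypothetical names, each condition is assumed to carry a single operator, and only numeric operands are handled.

```javascript
// Convert a single-operator numeric condition (or a bare value) into an interval.
function toInterval(condition) {
  const c =
    typeof condition === "object" && condition !== null
      ? condition
      : { $eq: condition };
  const [[op, v]] = Object.entries(c);
  switch (op) {
    case "$eq":
      return { low: v, lowOpen: false, high: v, highOpen: false };
    case "$gt":
      return { low: v, lowOpen: true, high: Infinity, highOpen: true };
    case "$gte":
      return { low: v, lowOpen: false, high: Infinity, highOpen: true };
    case "$lt":
      return { low: -Infinity, lowOpen: true, high: v, highOpen: true };
    case "$lte":
      return { low: -Infinity, lowOpen: true, high: v, highOpen: false };
    default:
      throw new Error("unsupported operator: " + op);
  }
}

// True when every value in `inner` is also in `outer`.
function isSubset(inner, outer) {
  const lowOk =
    inner.low > outer.low ||
    (inner.low === outer.low && (inner.lowOpen || !outer.lowOpen));
  const highOk =
    inner.high < outer.high ||
    (inner.high === outer.high && (inner.highOpen || !outer.highOpen));
  return lowOk && highOk;
}

// The partial index is usable only if, for every filter field, the query
// constrains that field to a subset of what the filter admits.
function canUsePartialIndex(query, filter) {
  return Object.entries(filter).every(([field, filterCondition]) => {
    if (!(field in query)) {
      return false; // query says nothing about the criteria field
    }
    return isSubset(toInterval(query[field]), toInterval(filterCondition));
  });
}

const indexFilter = { rating: { $gt: 5 } };
const usableForGte8 = canUsePartialIndex(
  { cuisine: "Italian", rating: { $gte: 8 } },
  indexFilter
);
const usableForLt8 = canUsePartialIndex(
  { cuisine: "Italian", rating: { $lt: 8 } },
  indexFilter
);
const usableWithoutRating = canUsePartialIndex({ cuisine: "Italian" }, indexFilter);
```

Under these assumptions, the three restaurant queries discussed above evaluate to usable, not usable, and not usable, respectively, matching the outcomes described in the text.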

As discussed, partial indexes determine the index entries based on the specified filter. The filter can include fields other than the index keys and can specify conditions other than just an existence check. For example, a partial index can implement:

db.contacts.createIndex(
  { name: 1 },
  { partialFilterExpression: { name: { $exists: true } } }
)

This partial index example supports queries based on an existence filter made on the name field (contained in the index). However, the system is also configured to resolve a partial index that specifies a filter expression on fields other than the index key. For example, the following operation creates a partial index, where the index is on the name field but the filter expression is on the email field:

db.contacts.createIndex(
  { name: 1 },
  { partialFilterExpression: { email: { $exists: true } } }
)

In order for the system to execute a query by accessing the partial index, a query optimizer function must evaluate the query predicate and determine that the result includes a non-null match on the email field as well as a condition on the name field.

For example, the following query can use the index above (i.e., the system is configured to determine that the query conditions match the filter conditions of the partial index, where matching includes generating a subset of the filter conditions):

-   -   db.contacts.find({name: "xyz", email: {$regex: /\.org$/} })

However, a contrary example is the following query, which cannot use the partial index above:

-   -   db.contacts.find({name: "xyz", email: {$exists: false} })
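The optimizer's determination for an existence filter may be approximated as follows. This is a simplified sketch under stated assumptions: guaranteesFieldExists and canUseExistsIndex are hypothetical names, only the listed operators are recognized, and any condition comparing against null is treated conservatively as not guaranteeing that the field is present.

```javascript
// Hypothetical subset of operators that can only match when the field is present
// (provided the operand is not null).
const presenceOperators = new Set(["$eq", "$gt", "$gte", "$lt", "$lte", "$regex"]);

// Does a query condition on a field guarantee that matching documents contain the field?
function guaranteesFieldExists(condition) {
  if (condition === undefined || condition === null) {
    return false; // field absent from the query, or compared against null
  }
  if (condition instanceof RegExp) {
    return true; // a bare regular expression can only match an existing value
  }
  if (typeof condition !== "object") {
    return true; // bare equality against a non-null value
  }
  return Object.entries(condition).some(([op, operand]) => {
    if (op === "$exists") {
      return operand === true;
    }
    return presenceOperators.has(op) && operand !== null;
  });
}

// A partial index filtered on { <filterField>: { $exists: true } } is usable only
// when the query guarantees that the filter field exists.
function canUseExistsIndex(query, filterField) {
  return guaranteesFieldExists(query[filterField]);
}

const regexQueryUsesIndex = canUseExistsIndex(
  { name: "xyz", email: { $regex: /\.org$/ } },
  "email"
);
const notExistsQueryUsesIndex = canUseExistsIndex(
  { name: "xyz", email: { $exists: false } },
  "email"
);
```

In this sketch the regular-expression query qualifies because a pattern can only match a document that has an email value, whereas the $exists: false query selects exactly the documents the partial index omits.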

Some embodiments implement restrictions on partial indexes that are created. For example, the system restricts creation of multiple versions of an index that differ only in the options. As such, the system can restrict creation of multiple partial indexes that differ only by the filter expression.

Example: Partial Index for a Collection

Consider a collection restaurants containing documents that resemble the following:

{
  "_id" : ObjectId("5641f6a7522545bc535b5dc9"),
  "address" : {
    "building" : "1007",
    "coord" : [ -73.856077, 40.848447 ],
    "street" : "Morris Park Ave",
    "zipcode" : "10462"
  },
  "borough" : "Bronx",
  "cuisine" : "Bakery",
  "rating" : {
    "date" : ISODate("2014-03-03T00:00:00Z"),
    "grade" : "A",
    "score" : 2
  },
  "name" : "Morris Park Bake Shop",
  "restaurant_id" : "30075445"
}

The system is configured to accept definition of a partial index on the borough and cuisine fields, for example specifying a filter for choosing only to index documents where the rating.grade field is A:

db.restaurants.createIndex(
  { borough: 1, cuisine: 1 },
  { partialFilterExpression: { 'rating.grade': { $eq: "A" } } }
)

Then, the following query on the restaurants collection is resolved by the system using the partial index to return the restaurants in the Bronx with rating.grade equal to A:

-   -   db.restaurants.find({borough: "Bronx", 'rating.grade': "A"})

In contrast, the system is configured to determine that the following query cannot use the partial index, because the query expression does not include the rating.grade field:

-   -   db.restaurants.find({borough: "Bronx", cuisine: "Bakery" })

Example: Partial Index with Unique Constraint

According to various embodiments, the system is configured to resolve partial indexes on the documents in a collection that meet a specified filter expression. In another embodiment, the system enables users to specify both a partialFilterExpression and a unique constraint on the partial index. The unique constraint is configured to ensure that the fields (including indexed fields) do not store duplicate values (i.e., the system enforces uniqueness for the indexed fields responsive to a unique constraint). In some examples, the system is configured to resolve the unique constraint only on the documents that meet the filter expression. Thus, a partial index with a unique constraint does not prevent the insertion of documents that would otherwise violate the unique constraint, if the documents do not meet the filter criteria.

For example, with a collection named users that contains the following documents:

-   -   {"_id": ObjectId("56424f1efa0358a27fa1f99a"), "username": "david", "age": 29}
    -   {"_id": ObjectId("56424f37fa0358a27fa1f99b"), "username": "amanda", "age": 35}
    -   {"_id": ObjectId("56424fe2fa0358a27fa1f99c"), "username": "rajiv", "age": 57}

The following operation is executed by the system to create an index that specifies a unique constraint on the username field and a partial filter expression age: {$gte: 21}.

db.users.createIndex(
  { username: 1 },
  { unique: true, partialFilterExpression: { age: { $gte: 21 } } }
)

According to one embodiment, the system is configured to resolve the index such that the index prevents the insertion of the following documents (since documents already exist with the specified usernames and the age fields are greater than or equal to 21):

-   -   db.users.insert({username: "david", age: 27})
    -   db.users.insert({username: "amanda", age: 25})
    -   db.users.insert({username: "rajiv", age: 32})

However, the following documents with duplicate usernames are allowed by the system (since the unique constraint only applies to documents with age greater than or equal to 21).

-   -   db.users.insert({username: "david", age: 20})
    -   db.users.insert({username: "amanda" })
    -   db.users.insert({username: "rajiv", age: null})
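The interaction between the unique constraint and the filter expression can be simulated with the following in-memory sketch. The names createUniquePartialIndex, insert, and ageAtLeast21 are hypothetical, and a rejected insertion is signaled here by a false return value, which is a simplification chosen for the example rather than the behavior of any particular database.

```javascript
// Hypothetical unique partial index: tracks keys only for documents that pass the filter.
function createUniquePartialIndex(keyField, filterMatches) {
  return { keyField, filterMatches, keys: new Set() };
}

// Insert a document, enforcing uniqueness only among documents that meet the filter.
function insert(collection, index, doc) {
  if (index.filterMatches(doc)) {
    const key = doc[index.keyField];
    if (index.keys.has(key)) {
      return false; // duplicate key among indexed documents: reject
    }
    index.keys.add(key);
  }
  collection.push(doc); // documents outside the filter are stored but never indexed
  return true;
}

// Filter corresponding to age: { $gte: 21 }; missing or null ages do not match.
const ageAtLeast21 = (doc) => typeof doc.age === "number" && doc.age >= 21;

const users = [];
const usernameIndex = createUniquePartialIndex("username", ageAtLeast21);

// Seed the collection with the three existing documents.
insert(users, usernameIndex, { username: "david", age: 29 });
insert(users, usernameIndex, { username: "amanda", age: 35 });
insert(users, usernameIndex, { username: "rajiv", age: 57 });

// Duplicates that meet the filter are rejected.
const rejected = [
  insert(users, usernameIndex, { username: "david", age: 27 }),
  insert(users, usernameIndex, { username: "amanda", age: 25 }),
  insert(users, usernameIndex, { username: "rajiv", age: 32 }),
];

// Duplicates that do not meet the filter are accepted.
const accepted = [
  insert(users, usernameIndex, { username: "david", age: 20 }),
  insert(users, usernameIndex, { username: "amanda" }),
  insert(users, usernameIndex, { username: "rajiv", age: null }),
];
```

The collection ends with six documents but only three index keys, reflecting that uniqueness is checked solely among documents that satisfy the filter expression.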

The various processes described herein can be configured to be executed on the systems shown by way of example in FIGS. 1, 2, and 4. The systems and/or system components shown can be programmed to execute the processes and/or functions described.

Additionally, other computer systems can be configured to perform the operations and/or functions described herein. For example, various embodiments according to the present invention may be implemented on one or more computer systems. These computer systems may be specially configured computers, such as those based on Intel Atom, Core, or PENTIUM-type processors, IBM PowerPC, AMD Athlon or Opteron, Sun UltraSPARC, or any other type of processor. Additionally, any system may be located on a single computer or may be distributed among a plurality of computers attached by a communications network.

A special-purpose computer system can be specially configured as disclosed herein. According to one embodiment of the invention the special-purpose computer system is configured to perform any of the described operations and/or algorithms. The operations and/or algorithms described herein can also be encoded as software executing on hardware that defines a processing component, that can define portions of a special purpose computer, reside on an individual special-purpose computer, and/or reside on multiple special-purpose computers.

FIG. 5 shows a block diagram of an example special-purpose computer system 500 on which various aspects of the present invention can be practiced. For example, computer system 500 may include a processor 506 connected to one or more memory devices 510, such as a disk drive, memory, or other device for storing data. Memory 510 is typically used for storing programs and data during operation of the computer system 500. Components of computer system 500 can be coupled by an interconnection mechanism 508, which may include one or more busses (e.g., between components that are integrated within a same machine) and/or a network (e.g., between components that reside on separate discrete machines). The interconnection mechanism enables communications (e.g., data, instructions) to be exchanged between system components of system 500.

Computer system 500 may also include one or more input/output (I/O) devices 502-504, for example, a keyboard, mouse, trackball, microphone, touch screen, a printing device, display screen, speaker, etc. Storage 512 typically includes a computer readable and writeable nonvolatile recording medium in which computer executable instructions are stored that define a program to be executed by the processor or information stored on or in the medium to be processed by the program.

The medium can, for example, be a disk 602 or flash memory as shown in FIG. 6. Typically, in operation, the processor causes data to be read from the nonvolatile recording medium into another memory 604 that allows for faster access to the information by the processor than does the medium. This memory is typically a volatile, random access memory such as a dynamic random access memory (DRAM) or static memory (SRAM). According to one embodiment, the computer-readable medium comprises a non-transient storage medium on which computer executable instructions are retained.

Referring again to FIG. 5, the memory can be located in storage 512 as shown, or in memory system 510. The processor 506 generally manipulates the data within the memory 510, and then copies the data to the medium associated with storage 512 after processing is completed. A variety of mechanisms are known for managing data movement between the medium and integrated circuit memory element and the invention is not limited thereto. The invention is not limited to a particular memory system or storage system.

The computer system may include specially-programmed, special-purpose hardware, for example, an application-specific integrated circuit (ASIC). Aspects of the invention can be implemented in software, hardware or firmware, or any combination thereof. Although computer system 500 is shown by way of example, as one type of computer system upon which various aspects of the invention can be practiced, it should be appreciated that aspects of the invention are not limited to being implemented on the computer system as shown in FIG. 5. Various aspects of the invention can be practiced on one or more computers having different architectures or components than those shown in FIG. 5.

It should be appreciated that the invention is not limited to executing on any particular system or group of systems. Also, it should be appreciated that the invention is not limited to any particular distributed architecture, network, or communication protocol.

Various embodiments of the invention can be programmed using an object-oriented programming language, such as Java, C++, Ada, or C# (C-Sharp). Other programming languages may also be used. Alternatively, functional, scripting, and/or logical programming languages can be used. Various aspects of the invention can be implemented in a non-programmed environment (e.g., documents created in HTML, XML or other format that, when viewed in a window of a browser program, render aspects of a graphical-user interface (GUI) or perform other functions). The system libraries of the programming languages are incorporated herein by reference. Various aspects of the invention can be implemented as programmed or non-programmed elements, or any combination thereof.

Various aspects of this invention can be implemented by one or more systems similar to system 700 shown in FIG. 7. For instance, the system can be a distributed system (e.g., client server, multi-tier system) that includes multiple special-purpose computer systems. In one example, the system includes software processes executing on a system associated with hosting database services, processing operations received from client computer systems, interfacing with APIs, receiving and processing client database requests, routing database requests, routing targeted database requests, routing global database requests, determining a global request is necessary, determining a targeted request is possible, verifying database operations, managing data distribution, replicating database data, migrating database data, etc. These systems can also permit client systems to request database operations transparently, with various routing processes handling and processing requests for data as a single interface, where the routing processes can manage data retrieval from database partitions, merge responses, and return results as appropriate to the client, among other operations.

There can be other computer systems that perform functions such as hosting replicas of database data, with each server hosting database partitions implemented as a replica set, among other functions. These systems can be distributed among a communication system such as the Internet. Various replication protocols can be implemented, and in some embodiments, different replication protocols can be implemented, with the data stored in the database replication under one model, e.g., asynchronous replication of a replica set, with metadata servers controlling updating and replication of database metadata under a stricter consistency model, e.g., requiring two phase commit operations for updates.

FIG. 7 shows an architecture diagram of an example distributed system 700 suitable for implementing various aspects of the invention. It should be appreciated that FIG. 7 is used for illustration purposes only, and that other architectures can be used to facilitate one or more aspects of the invention.

System 700 may include one or more specially configured special-purpose computer systems 704, 706, and 708 distributed among a network 702 such as, for example, the Internet. Such systems may cooperate to perform functions related to hosting a partitioned database, managing database metadata, monitoring distribution of database partitions, monitoring size of partitions, splitting partitions as necessary, migrating partitions as necessary, identifying sequentially keyed collections, and optimizing migration, splitting, and rebalancing for collections with sequential keying architectures.

Having thus described several aspects and embodiments of this invention, it is to be appreciated that various alterations, modifications and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description is by way of example only.

Use of ordinal terms such as “first,” “second,” “third,” “a,” “b,” “c,” etc., in the claims to modify or otherwise identify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

What is claimed is:
 1. A distributed database system, comprising: anon-relational database for storing a plurality of database documents,wherein the non-relational database includes a storage architectureconfigured to accept documents of dynamic architecture into one or morerespective collections of the documents; an index engine configured to:receive at least one index field, at least one index field value, acriteria field, and a criteria condition, wherein the criteria field isdifferent from the at least one index field; and generate a partialindex comprising the at least one index field and the at least one indexfield value from at least one document of the plurality of documents,wherein the index maps to any of a subset of documents within acollection that contain the criteria field and satisfy the criteriacondition, the subset including one or more documents within thecollection not containing the index field at identification; and a queryengine configured to: identify a search query containing the at leastone index field; determine that the search query is guaranteed to matcha set of documents that is contained within the subset of documents; andaccess the partial index to respond to the search query.
 2. The system of claim 1, wherein the index engine is further configured to resolve the criteria condition based on at least one of a comparison operator, arithmetic operator, bitwise operator, and operators on null.
 3. The system of claim 1, wherein the system is configured to resolve the criteria field and the criteria condition.
 4. The system of claim 3, wherein the system is configured to find a subset of documents that contain the criteria field and that satisfy the criteria condition.
 5. The system of claim 4, wherein the system is configured to pass the subset of documents for index generation.
 6. The system of claim 1, wherein the generation of the partial index is executed based on stages, including a filter stage and an index generation stage.
 7. The system of claim 1, wherein the partial index includes a key referencing the at least one document of the plurality of documents.
 8. The system of claim 1, wherein the plurality of database documents are a collection of documents in a non-relational database.
 9. The system of claim 8, wherein the index engine is further configured to identify, in the collection of documents, at least one document configured to store the criteria field.
 10. The system of claim 1, wherein the index engine is further configured to resolve the criteria condition which comprises at least a comparison operator and a comparison value.
 11. The system of claim 10, wherein the index engine is further configured to resolve the criteria condition based on the comparison operator selected from a group including at least a greater-than operator, a less-than operator, an equals operator, and a does-not-equal operator.
 12. The system of claim 1, wherein the index engine is further configured to resolve the criteria condition wherein the criteria condition comprises a logical operator for determining whether the criteria field is set in the at least one document of the plurality of documents.
 13. The system of claim 1, wherein at least one of the plurality of documents does not include the criteria field.
 14. A method for creating a database index for a plurality of database documents, comprising acts of: receiving at least one index field to be included in the database index; receiving a criteria field and a criteria condition, wherein the criteria field is different from the at least one index field; generating a partial index comprising the at least one index field and field value from at least one document of the plurality of documents, wherein generating the partial index comprises: determining that a subset of the plurality of documents contain the criteria field and satisfy the criteria condition; and mapping the partial index to the subset of the plurality of documents; identifying a search query containing the at least one index field; determining that the search query is guaranteed to match a set of documents that is contained within the subset of documents to which the index maps; and accessing the partial index to respond to the search query.
 15. The method of claim 14, further comprising identifying documents that contain the criteria field and satisfy the criteria condition.
 16. The method of claim 15, further comprising generating the partial index from the identified documents.
 17. The method of claim 14, wherein the plurality of database documents are a collection of documents in a non-relational database.
 18. The method of claim 17, further comprising identifying, in the collection of documents, at least one document configured to store the criteria field.
 19. The method of claim 14, wherein the method further comprises resolving at least one of a comparison operator, arithmetic operator, bitwise operator, and operators on null to determine whether the criteria condition is satisfied.
 20. The method of claim 19, wherein the comparison operator is selected from the group consisting of a greater-than operator, a less-than operator, an equals operator, and a does-not-equal operator.
 21. The method of claim 14, wherein the criteria condition comprises a logical operator for determining whether the criteria field is set in the at least one document of the plurality of documents.
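
By way of non-limiting illustration only, the acts recited above (a filter stage, an index generation stage, and a determination that a search query is guaranteed to fall within the indexed subset) may be sketched in Python as follows. The sketch assumes an in-memory collection of documents held as dictionaries. All names in it (OPS, satisfies, build_partial_index, query_is_covered, find) are hypothetical and are not taken from any described embodiment, and the coverage check is a deliberately conservative simplification that handles only the four comparison operators.

```python
import operator
from collections import defaultdict

# Hypothetical operator table covering the four comparison operators.
OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq, "ne": operator.ne}


def satisfies(doc, field, op, value):
    """True only if the document contains the field and the condition holds."""
    return field in doc and OPS[op](doc[field], value)


def build_partial_index(docs, index_field, criteria):
    """Filter stage, then index generation stage.

    Maps each index field value to the ids of documents that contain the
    criteria field and satisfy the criteria condition. Documents in the
    subset that lack the index field are keyed under None (an assumption).
    """
    c_field, c_op, c_value = criteria
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if satisfies(doc, c_field, c_op, c_value):          # filter stage
            index[doc.get(index_field)].append(doc_id)      # index generation
    return dict(index)


def query_is_covered(query_criteria, index_criteria):
    """Conservative test that the query's matches lie inside the indexed subset."""
    if query_criteria is None:
        return False
    q_field, q_op, q_val = query_criteria
    i_field, i_op, i_val = index_criteria
    if q_field != i_field:
        return False
    if i_op == "gt":
        return (q_op == "gt" and q_val >= i_val) or (q_op == "eq" and q_val > i_val)
    if i_op == "lt":
        return (q_op == "lt" and q_val <= i_val) or (q_op == "eq" and q_val < i_val)
    if i_op == "eq":
        return q_op == "eq" and q_val == i_val
    if i_op == "ne":
        return (q_op == "ne" and q_val == i_val) or (q_op == "eq" and q_val != i_val)
    return False


def find(docs, index, index_criteria, index_field, index_value, query_criteria=None):
    """Return (sorted matching ids, whether the partial index was used)."""
    use_index = query_is_covered(query_criteria, index_criteria)
    # Use the partial index only when coverage is guaranteed; otherwise scan.
    candidates = index.get(index_value, []) if use_index else list(docs)
    matches = [
        doc_id for doc_id in candidates
        if docs[doc_id].get(index_field) == index_value
        and (query_criteria is None or satisfies(docs[doc_id], *query_criteria))
    ]
    return sorted(matches), use_index


docs = {
    1: {"name": "ann", "rating": 9},
    2: {"name": "bob", "rating": 3},
    3: {"name": "ann"},              # lacks the criteria field
    4: {"rating": 8},                # lacks the index field
    5: {"name": "ann", "rating": 6},
}
criteria = ("rating", "gt", 5)
index = build_partial_index(docs, "name", criteria)
```

In this sketch, a query on name with rating greater than 7 is answered from the partial index, whereas a query with rating greater than 2, or with no condition on rating, falls back to a scan of the collection because its matches are not guaranteed to lie within the indexed subset.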